Add SQLite client and converter #424

criccomini · 2024-02-01T20:09:29Z

Recap can now read SQLite schemas as Recap types. SQLite's schema system is
somewhat strange. Some notes:

1. Any column can store any type.
2. SQLite has 5 storage classes (NULL, INTEGER, REAL, TEXT, BLOB).
3. STRICT forces column types to be the storage types.
4. non-STRICT tables allow any strings for column types.
5. non-STRICT column types are hints to coerce types as they're written to disk.
6. Parentheses in types (e.g. DOUBLE(6, 2)) are ignored by SQLite.

See https://www.sqlite.org/datatype3.html#storage_classes_and_datatypes for more
details.
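
To make a couple of these notes concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `demo` table and its columns are made up for illustration and are not part of this PR. It shows a non-STRICT INTEGER column keeping a non-numeric string as text, and a VARCHAR(3) column storing a much longer value because the length in parentheses is ignored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Non-STRICT table: the declared types are only hints (hypothetical schema).
conn.execute("CREATE TABLE demo (n INTEGER, s VARCHAR(3))")

# "abc" cannot be coerced to a number, so it is stored as text in the
# INTEGER column. The VARCHAR(3) length limit is not enforced either.
conn.execute("INSERT INTO demo VALUES (?, ?)", ("abc", "longer than three"))

row = conn.execute("SELECT n, typeof(n), s, length(s) FROM demo").fetchone()
print(row)
```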

With all of this in mind, Recap's SQLiteConverter works according to SQLite's
affinity rules. This means:

1. Unknown types are treated as "ANY", which is a union of all storage types.
2. SQLiteConverter pays attention to precision/scale for REAL, etc.
3. SQLiteConverter pays attention to lengths for VARCHAR(255), etc.
4. SQLiteConverter treats date, datetime, time, and timestamp as ANY types.

Closes #418
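
For reference, the affinity rules mentioned above can be sketched as a small function. This follows the substring rules in the SQLite datatype documentation linked above; it is not the code in `SQLiteConverter`, and the name `sqlite_affinity` is hypothetical. Note that SQLite itself falls back to NUMERIC affinity for unrecognized names, whereas the description above says Recap maps unknown and date-like types to "ANY".

```python
from typing import Optional


def sqlite_affinity(declared_type: Optional[str]) -> str:
    """Return the column affinity SQLite derives from a declared type name.

    Hypothetical helper: a sketch of SQLite's documented rules, checked in order.
    """
    name = (declared_type or "").upper()
    if "INT" in name:
        return "INTEGER"
    if any(token in name for token in ("CHAR", "CLOB", "TEXT")):
        return "TEXT"
    if "BLOB" in name or not name:
        return "BLOB"
    if any(token in name for token in ("REAL", "FLOA", "DOUB")):
        return "REAL"
    return "NUMERIC"


for declared in ("VARCHAR(255)", "DOUBLE(6, 2)", "FLOATING POINT", "DATETIME", ""):
    print(repr(declared), "->", sqlite_affinity(declared))
```

The order matters: "FLOATING POINT" ends up with INTEGER affinity because the "INT" check runs before the "FLOA" check.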

mjperrone · 2024-02-02T17:52:10Z

> non-STRICT column types are hints to coerce types as they're written to disk.

That's interesting. That means the types are different on write vs read. You can write a "5" to a table but then get a 5 back on query. I haven't thought through all the implications of that yet, but seems worth pondering.

This reminds me of Kysely's Generated label. Kysely is a type-safe query builder for TypeScript. It lets you define a TypeScript interface representing a table, which then gives you type safety on insert and select. For columns with default values (you can write null but get a value back), it offers Generated: the TS type is optional on insert but required on select.

criccomini · 2024-02-02T18:28:31Z

Yea, gotta say I'm not a fan of SQLite's type system. And yes, you're correct, you can write "5" and get back 5. And this coercion is based on the type of the column, which can be anything! You can make the type NOT AN INTEGER, then write "5", and get back 5. 😭
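
A minimal sketch of this write/read mismatch, using Python's built-in `sqlite3` module. The table name is made up, and the column type `FLOATING POINT` is borrowed from the SQLite documentation as an example of a name that gets INTEGER affinity only because it contains "INT".

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "FLOATING POINT" contains "INT", so SQLite gives the column INTEGER affinity.
conn.execute("CREATE TABLE demo (x FLOATING POINT)")

# Write the string "5" ...
conn.execute("INSERT INTO demo VALUES (?)", ("5",))

# ... and read back whatever SQLite actually stored.
value, storage_class = conn.execute("SELECT x, typeof(x) FROM demo").fetchone()
print(repr(value), storage_class)
```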

criccomini · 2024-02-02T18:30:17Z

That said, for the happy path, I think my converter should work just fine. The only caveat is date types are treated as "any" types. This is actually how SQLite handles it. I'm open to extending the converter to handle dates in a custom way, but would rather keep that as a separate PR.

criccomini · 2024-02-02T22:45:29Z

@mjperrone This is ready for a review. 😄

mjperrone

This is great! BTW the only info I got about potential use cases is that it would be v3.

mjperrone · 2024-02-05T17:30:54Z

recap/clients/sqlite.py

+
+    def ls(self) -> list[str]:
+        cursor = self.connection.cursor()
+        cursor.execute("SELECT name FROM sqlite_schema WHERE type='table'")


I could imagine also wanting to capture schema of views. Doesn't seem necessary for the first go at this.

Ohh, good point. I'll open a follow-on GH issue for that. I hope it's as easy as type in ('table', 'view') 😅
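
A rough sketch of what that follow-up might look like, assuming the `type IN ('table', 'view')` idea works as hoped. The function name `list_tables_and_views` and the sample schema are hypothetical. The sketch queries `sqlite_master`, the older name for `sqlite_schema`, only so that it also runs against older SQLite builds.

```python
import sqlite3


def list_tables_and_views(connection: sqlite3.Connection) -> list:
    """List the names of all tables and views (hypothetical helper)."""
    cursor = connection.cursor()
    cursor.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    )
    # Each row is a 1-tuple, so take the first element.
    return [row[0] for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("CREATE VIEW user_names AS SELECT name FROM users")
names = list_tables_and_views(conn)
print(names)
```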

mjperrone · 2024-02-05T17:32:01Z

recap/clients/sqlite.py

+            row = self.add_information_schema(row)
+            row = self.add_information_schema(row)


Why is this duplicated?

Suggested change

-            row = self.add_information_schema(row)
-            row = self.add_information_schema(row)
+            row = self.add_information_schema(row)

Good catch. I had two different functions but merged them into one i_s function.

mjperrone · 2024-02-05T17:32:18Z

recap/clients/sqlite.py

+        rows = []
+
+        for row_cells in cursor.fetchall():
+            row = dict(zip(names, row_cells))


mjperrone · 2024-02-05T17:38:41Z

recap/clients/sqlite.py

+        }
+
+        # Extract precision, scale, and octet length.
+        numeric_pattern = re_compile(r"(\w+)\((\d+)(?:,\s*(\d+))?\)")


GPT helped me understand that this matches strings that look like function calls. It might be helpful to point to some documentation on what TYPE can look like.

Added docs.
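
For readers of this thread, here is a small sketch of what the pattern captures. The regular expression is copied from the diff above; the `parse_type` wrapper and the use of `.match()` are assumptions for illustration, not the PR's code.

```python
from re import compile as re_compile

# Pattern copied from the diff: TYPE(n) or TYPE(n, m).
numeric_pattern = re_compile(r"(\w+)\((\d+)(?:,\s*(\d+))?\)")


def parse_type(column_type: str):
    """Split a declared type into (name, first_number, second_number).

    Hypothetical helper. Types without parentheses return (column_type, None, None).
    """
    match = numeric_pattern.match(column_type)
    if match is None:
        return column_type, None, None
    name, first, second = match.groups()
    return name, int(first), int(second) if second is not None else None


for declared in ("DOUBLE(6, 2)", "VARCHAR(255)", "TEXT"):
    print(declared, "->", parse_type(declared))
```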

mjperrone · 2024-02-05T17:43:02Z

recap/clients/sqlite.py

+        cursor.execute(
+            "SELECT name FROM sqlite_master WHERE type='table' AND name=?", (table,)
+        )
+        return bool(cursor.fetchone())


This confuses me since above I saw row[0] for row in cursor.fetchall(). If there is only one element SELECTed, is a tuple returned for that row, or just the value?

The content isn't as important as whether the row exists or not. I'm doing the `_table_exists` check to prevent SQL injection attacks, since you can't parameterize PRAGMA calls with `%s` or `?`. The SELECT fetches all rows that exactly match the table string. If at least one row exists, we assume the table exists.

The row[0] for row in cursor.fetchall() is for listing all tables in the database. But it's not guaranteed that a user will call schema() with a table from the ls() command, so I wanted to guard against injection.
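
The guard described above can be sketched roughly as follows. This is an illustration, not the PR's code: `table_exists` and `column_types` are hypothetical names, and quoting the identifier before interpolating it into the PRAGMA is an extra precaution added here. It also answers the earlier question: `fetchone()` returns a 1-tuple such as `('users',)` when a row matches, or `None` otherwise, so `bool()` works as an existence check.

```python
import sqlite3


def table_exists(connection: sqlite3.Connection, table: str) -> bool:
    cursor = connection.cursor()
    # Ordinary queries can be parameterized, so this lookup is injection-safe.
    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type='table' AND name=?", (table,)
    )
    # A matching row is a non-empty tuple (truthy); no match gives None.
    return bool(cursor.fetchone())


def column_types(connection: sqlite3.Connection, table: str) -> list:
    """Return (column name, declared type) pairs for an existing table."""
    if not table_exists(connection, table):
        raise ValueError(f"Unknown table: {table!r}")
    # PRAGMA arguments cannot be bound with ?, so the name is interpolated
    # only after the existence check, and quoted as an identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = connection.execute(f"PRAGMA table_info({quoted})").fetchall()
    # table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
print(column_types(conn, "users"))
```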

mjperrone · 2024-02-05T17:50:00Z

tests/unit/clients/test_sqlite.py

+                name TEXT NOT NULL,
+                short_name CHAR(20) NOT NULL,
+                variable_name VARCHAR(255) NOT NULL,
+                descriptive_text CLOB NOT NULL,


CLOB is a weird abbreviation

Indeed. And yet it's a thing. 😛

Python's `urlunparse` has an annoying behavior: it omits the double slashes if the netloc param is empty. This is a problem for URLs like `sqlite:///foo/bar/baz`, which would come back as `sqlite:/foo/bar/baz`. I've updated the safe/unsafe methods in `RecapSettings` to account for this.
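
One possible way to work around this is sketched below. It is not the code in `RecapSettings`; the helper name `roundtrip_keep_slashes` is hypothetical. It reparses a URL and restores the `//` if the original had it but the rebuilt string lost it.

```python
from urllib.parse import urlparse, urlunparse


def roundtrip_keep_slashes(url: str) -> str:
    """Parse and rebuild a URL, keeping the '//' after the scheme.

    Hypothetical helper for illustration only.
    """
    parts = urlparse(url)
    rebuilt = urlunparse(parts)
    prefix = f"{parts.scheme}:"
    had_slashes = url.lower().startswith(prefix + "//")
    if parts.scheme and had_slashes and not rebuilt.startswith(prefix + "//"):
        # Re-insert the slashes dropped because the netloc was empty.
        rebuilt = prefix + "//" + rebuilt[len(prefix):]
    return rebuilt


print(urlunparse(urlparse("sqlite:///foo/bar/baz")))
print(roundtrip_keep_slashes("sqlite:///foo/bar/baz"))
```

URLs that already carry a netloc, such as ordinary `http://` URLs, pass through unchanged.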

criccomini · 2024-02-05T22:34:45Z

Merged! It's now available in 0.11.0! 😄

https://pypi.org/project/recap-core/

mjperrone · 2024-02-06T01:14:55Z

Awesome! You also might want to update the homepage compatibility table to include this 🥳

criccomini · 2024-02-06T06:56:04Z

On the list for tomorrow! I want to add docs to the website first.

criccomini · 2024-02-06T17:56:50Z

Done! https://recap.build/docs/integrations/sqlite/

criccomini · 2024-02-29T23:17:57Z

Released in 0.12.0:

https://pypi.org/project/recap-core/0.12.0/

Don't nest UnionType when calling make_nullable

1bf854c

There was a bug in `make_nullable` that caused UnionTypes to get double nested. I've updated the code to keep the UnionType flat. The code now adds a NullType if one isn't in the `types` attribute. It also sets `default` appropriately.
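
The idea behind the fix can be sketched with simplified stand-in classes. `SketchType` and `SketchUnion` below are hypothetical and are not Recap's real type classes. How `default` is handled here (kept if already set, otherwise `None`) is an assumption about what "appropriately" means.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class SketchType:
    """Stand-in for a simple type such as 'null' or 'int32' (hypothetical)."""
    name: str


@dataclass
class SketchUnion:
    """Stand-in for a union type with a default value (hypothetical)."""
    types: List[Any]
    default: Any = None


def make_nullable(type_: Any) -> SketchUnion:
    if isinstance(type_, SketchUnion):
        # Already a union: keep it flat and add a null member only if missing.
        types = list(type_.types)
        has_null = any(
            isinstance(t, SketchType) and t.name == "null" for t in types
        )
        if not has_null:
            types.insert(0, SketchType("null"))
        return SketchUnion(types=types, default=type_.default)
    # Not a union yet: wrap once, with null as the first member.
    return SketchUnion(types=[SketchType("null"), type_], default=None)


once = make_nullable(SketchType("int32"))
twice = make_nullable(once)
print(twice)
```

Calling `make_nullable` on an already nullable union returns an equivalent flat union instead of wrapping it in another one.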

criccomini mentioned this pull request Feb 1, 2024

Support SQLite converter into Recap #418

Closed

criccomini force-pushed the sqlite branch from 170a4a7 to 9974c03 Compare February 1, 2024 20:12

criccomini changed the title ~~Sqlite~~ Add SQLite client and converter Feb 1, 2024

criccomini force-pushed the sqlite branch 2 times, most recently from b5a525f to 7ba2151 Compare February 2, 2024 00:39

criccomini force-pushed the sqlite branch from 7ba2151 to 3b1f1c0 Compare February 2, 2024 22:40

mjperrone approved these changes Feb 5, 2024

criccomini force-pushed the sqlite branch from 3b1f1c0 to ac609f3 Compare February 5, 2024 22:04

criccomini merged commit 0ea697d into main Feb 5, 2024
3 checks passed

criccomini deleted the sqlite branch February 5, 2024 22:10

criccomini mentioned this pull request Feb 5, 2024

Add view support for SQLite client #425

Open
